Real-time statistical rules for spam detection

نویسندگان

  • Quang-Anh Tran
  • Haixin Duan
  • Xing Li
چکیده

Spam detections fall into two categories: rule-based and statistical-based. The former refers to the detection which is performed by looking for spam-liked patterns in an email. Since the rules can be shared, they have been popularized quickly. The rules, however, are built manually it is hard to keep them up with the variation of spam. The statistical-based method, on the other hand, is possible to make the detector retrained quickly, but knowledge obtained from this method is unable to be shared among the servers. We, therefore, proposed a statistical rule-based method for spam detection. A widely used rule set Chinese_rules.cf, for SpamAssassin to catch spam written in Chinese is generated by this method. It can be updated automatically and can also be shared among servers. A generating process of the Chinese_rules.cf is described. Factors that control the rule’s performance are discussed.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network

In recent years, there has been considerable interest among people to use short message service (SMS) as one of the essential and straightforward communications services on mobile devices. The increased popularity of this service also increased the number of mobile devices attacks such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...

متن کامل

Enterprise Anti-Spam Solution Based on Machine Learning Approach

Spam-detection systems based on traditional methods have several obvious disadvantages like low detection rate, necessity of regular knowledge bases’ updates, impersonal filtering rules. New intelligent methods for spam detection, which use statistical and machine learning algorithms, solve these problems successfully. But these methods are not widespread in spam filtering for enterprise-level ...

متن کامل

A Machine Learning Approach to Server-side

Spam-detection systems based on traditional methods have several obvious disadvantages like low detection rate, necessity of regular knowledge bases’ updates, impersonal filtering rules. New intelligent methods for spam detection, which use statistical and machine learning algorithms, solve these problems successfully. But these methods are not widespread in spam filtering for enterprise-level ...

متن کامل

Moving dispersion method for statistical anomaly detection in intrusion detection systems

A unified method for statistical anomaly detection in intrusion detection systems is theoretically introduced. It is based on estimating a dispersion measure of numerical or symbolic data on successive moving windows in time and finding the times when a relative change of the dispersion measure is significant. Appropriate dispersion measures, relative differences, moving windows, as well as tec...

متن کامل

Detecting E-mail Spam Using Spam Word Associations

Now-a-days, mailbox management has become a big task. A large proportion of the emails we receive are spam. These unwanted emails clog the inbox and are very ubiquitous. Here, a new technique for spam detection is presented that makes use of clustering and association rules generated by the Apriori algorithm. Vector space notation is used to represent the emails. The results obtained from exper...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006